(57) Abstract:

PROBLEM TO BE SOLVED: To improve recognition accuracy in non-parametric pattern identification while limiting the memory capacity needed to store reference patterns, even when k of the k-nearest-neighbor method is 3 or more.

SOLUTION: A recognition dictionary management part 16 edits a recognition dictionary 15 so as to eliminate patterns far from the identification boundary, and a recognition processing part 14 recognizes characters accurately by local linear identification in a high-dimensional projected space using a kernel trick.

1. This document has been translated by computer, so the translation may not reflect the original precisely.

2. **** indicates a word that could not be translated.
3. In the drawings, no words are translated.



CLAIMS 



[Claim(s)] 

[Claim 1] A pattern recognition device which judges, based on a recognition dictionary, to which category an input pattern belongs and performs pattern recognition of said input pattern, the device comprising:

a recognition dictionary which classifies and stores a plurality of reference patterns by category;
an editing means which deletes, from among the reference patterns in said recognition dictionary, reference patterns that are distant from a discrimination boundary between categories; and

a recognition means which performs pattern recognition by local linear discrimination based on the recognition dictionary from which said editing means has deleted the reference patterns distant from the discrimination boundary.

[Claim 2] The pattern recognition device according to claim 1, wherein said recognition means performs pattern recognition by local linear discrimination in a high-dimensional space to which the original feature vector of the input pattern is nonlinearly mapped.

[Claim 3] The pattern recognition device according to claim 1 or 2, wherein said recognition means uses as its discriminant function a Gaussian kernel under which the Euclidean distance relations of the original discrimination space are preserved in the high-dimensional space of the mapping destination.

[Claim 4] A pattern recognition method which judges, based on a recognition dictionary, to which category an input pattern belongs and performs pattern recognition of said input pattern, the method including: an editing step of deleting, from among the reference patterns in a recognition dictionary which classifies and stores a plurality of reference patterns by category, reference patterns that are distant from a discrimination boundary between categories; and a recognition step of performing pattern recognition by local linear discrimination based on the recognition dictionary from which the reference patterns distant from the discrimination boundary have been deleted by said editing step.

[Claim 5] The pattern recognition method according to claim 4, wherein said recognition step performs pattern recognition by local linear discrimination in a high-dimensional space to which the original feature vector of the input pattern is nonlinearly mapped.

[Claim 6] The pattern recognition method according to claim 4 or 5, wherein said recognition step uses as its discriminant function a Gaussian kernel under which the Euclidean distance relations of the original discrimination space are preserved in the high-dimensional space of the mapping destination.

[Claim 7] A computer-readable program which makes a computer perform the method according to any one of claims 4 to 6.
DETAILED DESCRIPTION 



[Detailed Description of the Invention] 
[0001] 

[Field of the Invention] This invention relates to a pattern recognition device which judges, based on a recognition dictionary, to which category an input pattern belongs and performs pattern recognition of the input pattern, to a pattern recognition method, and to a program which makes a computer perform the method. In particular, it relates to a pattern recognition device, a pattern recognition method, and a program which can raise recognition accuracy in nonparametric pattern recognition while limiting the memory capacity for storing reference patterns, even when k of the k-nearest-neighbor method is three or more.
[0002] 

[Description of the Prior Art] A pattern recognition technique called k-nearest-neighbor discrimination is conventionally known, in which the k patterns nearest to an input pattern are selected from a set of sample patterns and the class of the input pattern is determined from the labels of those patterns. Techniques aimed at speeding up this processing and improving its recognition accuracy have also become known recently.

[0003] However, because the problems of reference-pattern storage capacity and of recognition accuracy remain even with such conventional techniques, the present applicant proposed, in Patent Application No. 2000-347272, a configuration which deletes the reference patterns in a recognition dictionary that are distant from the discrimination boundary between categories, thereby raising recognition accuracy in nonparametric pattern recognition while limiting the memory capacity for storing reference patterns.
[0004] Specifically, the discriminant function is based on variable kernel density estimation, and the bandwidth σ_j is set to a constant multiple of the nearest-neighbor distance to a different category. Letting N_i be the number of reference patterns among the k nearest neighbors and d be the dimension of a pattern, the feature of the technique is that the kernel weighting factor 1/(N_i σ_j^d) of the strict variable kernel density estimation method is omitted.

[0005] According to this prior technique, when k of the k-nearest-neighbor method is 2, the curved surface connecting the midpoints of pairs of neighboring patterns that belong to different categories becomes the discrimination boundary, and results suggesting high generalization ability are obtained.
[0006] 

[Problem(s) to be Solved by the Invention] However, with this prior technique it is uncertain whether a desirable result is obtained when k is three or more. Because k > 2 is used in many cases in actual character recognition, similar characters must be distinguished with sufficient accuracy even when three or more nearest neighbors are used.
[0007] This invention is made to solve the above problem of the conventional technology. Its purpose is to provide a pattern recognition device, a pattern recognition method, and a program which makes a computer perform the method, each of which can raise recognition accuracy in nonparametric pattern recognition while limiting the memory capacity for storing reference patterns, even when k is three or more.

[0008] 

[Means for Solving the Problem] To solve the problem described above and attain the purpose, the pattern recognition device according to the invention of claim 1 is a pattern recognition device which judges, based on a recognition dictionary, to which category an input pattern belongs and performs pattern recognition of said input pattern, and which comprises:
a recognition dictionary which classifies and stores a plurality of reference patterns by category;
an editing means which deletes, from among the reference patterns in said recognition dictionary, reference patterns that are distant from a discrimination boundary between categories; and

a recognition means which performs pattern recognition by local linear discrimination based on the recognition dictionary from which said editing means has deleted the reference patterns distant from the discrimination boundary.

[0009] In the pattern recognition device according to the invention of claim 2, said recognition means in the invention of claim 1 performs pattern recognition by local linear discrimination in a high-dimensional space to which the original feature vector of the input pattern is nonlinearly mapped.

[0010] In the pattern recognition device according to the invention of claim 3, said recognition means in the invention of claim 1 or 2 uses as its discriminant function a Gaussian kernel under which the Euclidean distance relations of the original discrimination space are preserved in the high-dimensional space of the mapping destination.
[0011] The pattern recognition method according to the invention of claim 4 is a pattern recognition method which judges, based on a recognition dictionary, to which category an input pattern belongs and performs pattern recognition of said input pattern, and which includes: an editing step of deleting, from among the reference patterns in a recognition dictionary which classifies and stores a plurality of reference patterns by category, reference patterns that are distant from a discrimination boundary between categories; and a recognition step of performing pattern recognition by local linear discrimination based on the recognition dictionary from which the reference patterns distant from the discrimination boundary have been deleted by said editing step.

[0012] In the pattern recognition method according to the invention of claim 5, said recognition step in the invention of claim 4 performs pattern recognition by local linear discrimination in a high-dimensional space to which the original feature vector of the input pattern is nonlinearly mapped.

[0013] In the pattern recognition method according to the invention of claim 6, said recognition step in the invention of claim 4 or 5 uses as its discriminant function a Gaussian kernel under which the Euclidean distance relations of the original discrimination space are preserved in the high-dimensional space of the mapping destination.
[0014] The program according to the invention of claim 7 makes a computer perform the method according to any one of claims 4 to 6. The program is thereby machine-readable, and the operation of any one of claims 4 to 6 can be realized by a computer.

[0015] 

[Embodiment of the Invention] A preferred embodiment of the pattern recognition device, the pattern recognition method, and the program which makes a computer perform the method according to this invention is described in detail below with reference to the accompanying drawings. This embodiment shows the case where the invention is applied to a character reader.

[0016] (Composition of a character reader) The composition of the character reader according to this embodiment is explained first. Drawing 1 is a functional block diagram showing the composition of the character reader according to this embodiment. The character reader shown in the figure edits the dictionary and approximates a nonlinear class boundary by using the local linear discrimination described later for the classification hyperplane. By adopting the technique called the kernel trick, it performs local linear discrimination in a nonlinearly mapped high-dimensional (infinite-dimensional) space.
[0017] As shown in the figure, this character reader 10 comprises the image input part 11, the preprocessing part 12, the feature extraction part 13, the recognition processing part 14, the recognition dictionary 15, and the recognition dictionary management part 16.

[0018] Here, the recognition dictionary 15 corresponds to the recognition dictionary of claim 1, the recognition dictionary management part 16 corresponds to the editing means of claim 1, and the recognition processing part 14 corresponds to the recognition means of claim 1.

[0019] The image input part 11 is an input device, such as a scanner, which optically reads the image of a character. The image data read by this image input part 11 is output to the preprocessing part 12.
[0020] The preprocessing part 12 is a processing part which preprocesses the image data received from the image input part 11. Specifically, it smooths the image data to remove noise, binarizes the result with a predetermined threshold to obtain a binary image, cuts out a character from this binary image, and normalizes it.
[0021] The feature extraction part 13 is a processing part which extracts feature quantities from the normalized character data preprocessed by the preprocessing part 12. Specifically, it divides the image data of a character into a mesh of, for example, 5x5 and obtains the direction of the outline in each mesh cell. For example, when the outline direction is quantized into eight directions, a 5x5x8 = 200-dimensional feature space is formed.

[0022] The recognition processing part 14 is a processing part which compares the feature quantities extracted from an input character, such as a handwritten character, with the feature quantities in the recognition dictionary 15 prepared beforehand, judges to which category the input character belongs, and performs pattern recognition based on the result of this judgment.

[0023] Specifically, this recognition processing part 14 adopts the kernel trick to perform local linear discrimination in a nonlinearly mapped high-dimensional (infinite-dimensional) space. The kernel trick and local linear discrimination are explained later.

[0024] The recognition dictionary 15 is a dictionary used by the recognition processing part 14 to recognize input characters. Specifically, it associates a category with each character and stores the feature quantities (reference data) of characters for each category.

[0025] The recognition dictionary management part 16 is a processing part which creates and manages the recognition dictionary 15. Specifically, it raises recognition accuracy by making the bandwidth variable so that the discriminant function can be set finely, and it reduces the capacity of the recognition dictionary 15 by performing editing processing, in which patterns distant from the discrimination boundary are deleted.
[0026] (The concept of editing processing) Next, the editing processing by the recognition dictionary management part 16 shown in drawing 1 is explained concretely. Drawing 2 is an explanatory view showing an example of the distribution of two-dimensional reference patterns belonging to two categories, and drawing 3 is an explanatory view in which discrimination boundaries are drawn on the distribution of reference patterns shown in drawing 2.

[0027] In general, discrimination techniques can be classified into (1) parametric techniques and (2) nonparametric techniques. (1) Parametric techniques include linear discrimination, in which the discrimination boundary is a hyperplane, and quadratic discrimination, in which the discrimination boundary is a quadratic hypersurface. (2) Nonparametric techniques include nearest-neighbor discrimination, in which categories are separated by a Voronoi boundary, and the Parzen classifier, which has a smoothed discrimination boundary.

[0028] Consider the case shown in drawing 2, where a group of reference patterns belonging to category A (small rectangles in the figure) and a group belonging to category B (large rectangles in the figure) exist, and the reference patterns of category A are sandwiched between the reference patterns of category B. In this case, the discrimination boundary of quadratic discrimination or a smoothed nonparametric discrimination boundary is formed as shown in drawing 3.

[0029] Categories can thus be discriminated by using a conventional nonparametric discrimination boundary, but if the conventional boundary is used as it is, the number of reference patterns that must be stored in the recognition dictionary 15 increases. For this reason, the recognition dictionary management part 16 performs editing processing to reduce the number of reference patterns.

[0030] Next, the concept of the editing processing performed by this recognition dictionary management part 16 is explained in more detail in comparison with a Parzen classifier. Letting s_i be d-dimensional data, N the number of data, K_d[·] a kernel function, and h the bandwidth, the Parzen window density function is [Equation 1]

\[ p_N(x) = \frac{1}{N h^d} \sum_{i=1}^{N} K_d\!\left[\frac{x - s_i}{h}\right] \qquad (1) \]

Therefore, if K_d[·] and h are chosen appropriately, p_N(x) converges to the probability density distribution of x.

[0031] The necessary conditions in this case are K_d[·] >= 0, \int K_d[x]\,dx = 1, \lim_{N \to \infty} h = 0, and \lim_{N \to \infty} N h^d = \infty.

[0032] Letting H be a d x d nonsingular matrix, the above expression can be written in a more general form as [Equation 2]

\[ p(x) = \frac{1}{N\,|H|} \sum_{i=1}^{N} K_d\!\left[H^{-1}(x - s_i)\right] \qquad (2) \]

Here |H| denotes the absolute value of the determinant of H.
[0033] When a Gaussian kernel is used, expression (1) becomes [Equation 3]

\[ p(x) = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{(2\pi)^{d/2} h^d} \exp\!\left\{-\frac{\|x - s_i\|^2}{2h^2}\right\} \qquad (3) \]
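As a minimal sketch of the Gaussian-kernel form of expression (1), assuming each reference pattern is given as a list of floats, the Parzen-window estimate might be computed as follows. The function name `parzen_density` and the sample values are illustrative only and do not appear in the original.

```python
import math

def parzen_density(x, samples, h):
    """Gaussian Parzen-window estimate of the density at x.

    x       : d-dimensional point (sequence of floats)
    samples : reference patterns s_1 .. s_N, each a d-dimensional sequence
    h       : bandwidth (assumed common to all samples here)
    """
    d = len(x)
    # Normalisation N * (2*pi)^(d/2) * h^d of the Gaussian window.
    norm = len(samples) * (2.0 * math.pi) ** (d / 2.0) * h ** d
    total = 0.0
    for s in samples:
        sq = sum((xi - si) ** 2 for xi, si in zip(x, s))
        total += math.exp(-sq / (2.0 * h * h))
    return total / norm

# Hypothetical one-dimensional example with a single reference pattern.
value_at_sample = parzen_density([0.0], [[0.0]], 1.0)
```

With one sample and h = 1 the estimate reduces to the standard normal density, which gives a simple check of the normalisation.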

[0034] Expression (2) likewise becomes [Equation 4]

where Σ is the sample covariance matrix.

[0035] When a Parzen classifier is used directly, the point estimate of the probability density for each category is [Equation 5]

(5)

and the category w_i for which this estimate is largest is taken as the discrimination result.

[0036] Drawing 4 is an explanatory view of the discrimination concept when a Parzen classifier is applied to one-dimensional data. The data shown by O in the figure are artificially generated data whose distribution mixes the normal distribution N(190, 30^2), with mean 190 and standard deviation 30, and the normal distribution N(380, 30^2), with mean 380 and standard deviation 30, in the ratio 8 to 2. The data shown by ** in the figure are artificial data whose distribution mixes the normal distribution N(230, 60^2), with mean 230 and standard deviation 60, and the normal distribution N(330, 10^2), with mean 330 and standard deviation 10, in the ratio 6 to 4. Each category has ten data.

[0037] First, the bandwidth for each mixed distribution is fixed to the standard deviations averaged by the mixing ratio. That is, when the bandwidth of category A is set to (30x8+30x2)/10 = 30 and the bandwidth of category B is set to (60x6+10x4)/10 = 40, the density functions estimated from the ten data of each category become the curves shown in (a) of the figure.

[0038] When the bandwidth is instead fixed for each component distribution, that is, when the bandwidth of category A is set to 30 for the data of (1) and 30 for the data of (2), and the bandwidth of category B is set to 60 for the data of (3) and 10 for the data of (4), the curves shown in (b) of the figure are obtained. When a certain x is given, it is judged to belong to the category whose density function is larger at x.
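The fixed-bandwidth decision rule described above can be sketched as follows, under the assumption of one-dimensional data and a Gaussian window. The helper names (`mixed_bandwidth`, `density_1d`, `classify`) and the twenty sample values are hypothetical; only the standard deviations and mixing ratios are taken from the text.

```python
import math

def mixed_bandwidth(sigmas, weights):
    """Standard deviations averaged by the mixing ratio."""
    return sum(s * w for s, w in zip(sigmas, weights)) / sum(weights)

def density_1d(x, samples, h):
    """One-dimensional Gaussian Parzen estimate with fixed bandwidth h."""
    c = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))
    return c * sum(math.exp(-(x - s) ** 2 / (2.0 * h * h)) for s in samples)

def classify(x, data, bandwidths):
    """Return the category whose estimated density is larger at x."""
    return max(data, key=lambda c: density_1d(x, data[c], bandwidths[c]))

h_a = mixed_bandwidth([30, 30], [8, 2])   # category A
h_b = mixed_bandwidth([60, 10], [6, 4])   # category B

# Hypothetical samples, ten per category (not the data of drawing 4).
data = {
    "A": [150.0, 170.0, 185.0, 190.0, 200.0, 210.0, 225.0, 240.0, 370.0, 395.0],
    "B": [160.0, 250.0, 280.0, 300.0, 325.0, 328.0, 330.0, 333.0, 335.0, 340.0],
}
bandwidths = {"A": h_a, "B": h_b}
label_low = classify(190.0, data, bandwidths)
label_high = classify(330.0, data, bandwidths)
```

Fixing the bandwidth per component distribution instead would only change the value of h passed for each sample, which is why the text treats the two cases as variations of the same rule.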

[0039] Discrimination is thus possible with a Parzen classifier, but two problems arise: progressively more data N is needed as d increases (the so-called curse of dimensionality), and the bandwidth is fixed.

[0040] Therefore, the recognition dictionary management part 16 according to this embodiment eliminates h^d |Σ_j|^{1/2}, the term in the denominator of expression (5), and makes the bandwidth variable. Specifically, the posterior probability is computed from density estimation with a Gaussian kernel, using a Σ common to all categories and disregarding the density normalization term 1/h^d.

[0041] Letting C be the number of categories, the posterior probability of category w_i is given by [Equation 6].

[0042] Here, for the reference patterns edited so that those near the category boundary remain, the discriminant function g_i(x) is defined by [Equation 7].

[0043] The bandwidth h_ik of the k-th reference pattern of category i is [Equation 8]

\[ h_{ik} = \kappa \min_{j \neq i,\; m} \| s_{ik} - s_{jm} \| \qquad (8) \]

that is, it is set to a constant multiple of the shortest distance to any pattern of a different category. In this case, even if h_ik^d / h_jk^d != 1, higher discrimination accuracy is obtained by disregarding 1/h_jk^d.

[0044] Next, the editing procedure of the recognition dictionary management part 16 shown in drawing 1 is explained. Drawing 5 is a flow chart showing the editing procedure of the recognition dictionary management part 16 shown in drawing 1.

[0045] As shown in the figure, the recognition dictionary management part 16 first performs initialization: the set B of selected samples is set to {all samples}, the checked flags CFLG[x] given to all elements x of B are set to OFF, and the neighborhood search number r is set to 10k (Step S501).
[0046] Then the neighborhood search number r is compared with k' (Step S502). If r is less than k' (Step S502: No), processing ends. If r is k' or more (Step S502: Yes), one sample x is extracted at random from the elements of the set B for which CFLG[x] = OFF (Step S503).

[0047] It is then checked whether all r nearest neighbors of x belong to the same category as x (Step S504). The larger k' is, the further the surface approximating the envelope of the pattern distribution of one category moves away from the discrimination boundary toward the inside of the distribution, and the stronger the smoothing of the discrimination boundary becomes.

[0048] When all r nearest neighbors of x belong to the same category as x (Step S504: Yes), B is updated to B - {x}, all CFLGs are returned to OFF, the count value count is set to zero (Step S505), and processing returns to Step S503.

[0049] When at least one of the r nearest neighbors of x does not belong to the same category as x (Step S504: No), CFLG[x] is set to ON and the count value count is incremented (Step S506). It is then checked whether count is equal to or greater than the number of elements |B| of the set (Step S507). If it is not (Step S507: No), processing returns to Step S503.
[0050] When count is equal to or greater than |B| (Step S507: Yes), r is set to r - Δr, all CFLGs are returned to OFF, count is set to zero (Step S508), and processing returns to Step S502.

[0051] By performing the above series of editing processes, the recognition dictionary management part 16 can delete the reference patterns that are distant from the discrimination boundary and reduce the capacity of the recognition dictionary.
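The editing procedure of Steps S501 to S508 might be sketched as below. This is an illustration under stated assumptions, not the flow chart of drawing 5 itself: the function name `edit_dictionary`, its parameters, the brute-force neighbor search, the guard that stops when fewer than two samples remain, and the toy data are all hypothetical, and the checked flags CFLG are represented by a set whose size plays the role of the count value.

```python
import random

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def edit_dictionary(samples, labels, k_prime, r_init, delta_r=1, seed=0):
    """Return indices of the reference patterns kept after editing."""
    rng = random.Random(seed)
    kept = set(range(len(samples)))   # set B (S501)
    checked = set()                   # samples whose CFLG is ON
    r = r_init                        # neighborhood search number
    while r >= k_prime and len(kept) > 1:            # S502
        if len(checked) >= len(kept):                # S507: count >= |B|
            r -= delta_r                             # S508
            checked.clear()
            continue
        x = rng.choice(sorted(kept - checked))       # S503
        others = sorted(kept - {x},
                        key=lambda j: sq_dist(samples[x], samples[j]))
        neighbours = others[:r]                      # r nearest neighbors of x
        if all(labels[j] == labels[x] for j in neighbours):   # S504
            kept.discard(x)                          # S505: B <- B - {x}
            checked.clear()
        else:
            checked.add(x)                           # S506
    return sorted(kept)

# Hypothetical one-dimensional data: category A on the left, B on the right.
samples = [[0.0], [1.0], [2.0], [3.0], [4.0], [4.6], [6.0], [7.0], [8.0], [9.0]]
labels = ["A"] * 5 + ["B"] * 5
kept = edit_dictionary(samples, labels, k_prime=1, r_init=3)
```

In this toy case only the two patterns that face each other across the category boundary survive, which mirrors the behavior the text describes for patterns far from the boundary.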

[0052] Drawing 6 is an explanatory view of the process by which the recognition dictionary management part 16 reduces the reference patterns. When 200 samples exist for each category as shown in (a) of the figure, applying editing processing with k' = 5, that is, with the terminating condition that every set of five neighbors always contains patterns of different categories, gives the result shown in (b) of the figure.

[0053] Applying editing processing with k' = 4, that is, with the terminating condition that every set of four neighbors always contains patterns of different categories, gives the result shown in (c) of the figure, and applying it in the same way with the terminating condition k' = 3 gives the result shown in (d) of the figure.

[0054] As these figures show, when this editing processing is performed, the reference patterns near the boundary remain, while the reference patterns in the portions distant from the boundary are reduced.
[0055] Next, the effect of the bandwidth change made by the recognition dictionary management part 16 is explained concretely. Drawing 7 is an explanatory view of an example of the effect of this bandwidth change.

[0056] As in drawing 4, the data shown by O in the figure are artificially generated data whose distribution mixes the normal distribution N(190, 30^2), with mean 190 and standard deviation 30, and the normal distribution N(380, 30^2), with mean 380 and standard deviation 30, in the ratio 8 to 2. The data shown by ** in the figure are artificial data whose distribution mixes the normal distribution N(230, 60^2), with mean 230 and standard deviation 60, and the normal distribution N(330, 10^2), with mean 330 and standard deviation 10, in the ratio 6 to 4. Each category has ten data.

[0057] As shown in (a) of the figure, six errors arise when the bandwidth is fixed for each mixed distribution (category A: 30; category B: 40), and as shown in (b) of the figure, five errors arise when the bandwidth is fixed for each component distribution (category A: 30 and 30; category B: 60 and 10).
[0058] On the other hand, as shown in (c) of the figure, when the bandwidth is set to the nearest-neighbor distance to a different category, the number of errors decreases to three. As shown in (d) of the figure, when editing processing is also performed, the number of errors becomes two. The number of errors decreases in this case because the discrimination boundary between the two facing categories can be formed finely.
[0059] Next, the simplified case of two categories and k = 2 nearest neighbors is explained. The bandwidth is a constant multiple of the shortest distance min ||s_i - s_j|| (where s_i and s_j belong to different categories) to a pattern of a different category.

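[0060] In this case, the posterior probability of category w_1 given by expression (6) becomes [Equation 9]

\[ P(w_1|x) = \frac{\exp\{-\|x - s_1\|^2 / (2h_1^2)\}}{\exp\{-\|x - s_1\|^2 / (2h_1^2)\} + \exp\{-\|x - s_2\|^2 / (2h_2^2)\}} \qquad (9) \]

For the pair of neighboring patterns s_1 and s_2, h_1 = h_2 = κ||s_1 - s_2||.
[0061] Therefore, [Equation 10]

\[ P(w_1|x) = \frac{\exp\left\{-\frac{\|x - s_1\|^2}{2\kappa^2\|s_1 - s_2\|^2}\right\}}{\exp\left\{-\frac{\|x - s_1\|^2}{2\kappa^2\|s_1 - s_2\|^2}\right\} + \exp\left\{-\frac{\|x - s_2\|^2}{2\kappa^2\|s_1 - s_2\|^2}\right\}} \qquad (10) \]

holds. When x is the midpoint m = (s_1 + s_2)/2 of s_1 and s_2, as shown in drawing 8, P(w_1|m) = P(w_2|m) = 1/2, so the discrimination boundary passes through the midpoint of s_1 and s_2.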
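The midpoint property can be checked numerically with a short sketch. It assumes a Gaussian kernel with the bandwidth rule of expression (8) and one neighboring pattern per category; the function names and the coordinates of s_1 and s_2 are hypothetical.

```python
import math

def distance(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def bandwidth(s, other_category, kappa):
    """kappa times the shortest distance to a pattern of another category."""
    return kappa * min(distance(s, o) for o in other_category)

def posterior_w1(x, s1, s2, kappa=1.0):
    """Posterior of category w_1 for one neighboring pattern per category."""
    h1 = bandwidth(s1, [s2], kappa)
    h2 = bandwidth(s2, [s1], kappa)   # equal to h1 in this two-pattern case
    g1 = math.exp(-distance(x, s1) ** 2 / (2.0 * h1 * h1))
    g2 = math.exp(-distance(x, s2) ** 2 / (2.0 * h2 * h2))
    return g1 / (g1 + g2)

s1, s2 = (0.0, 0.0), (2.0, 4.0)
midpoint = tuple((p + q) / 2.0 for p, q in zip(s1, s2))
p_mid = posterior_w1(midpoint, s1, s2)
p_near_s1 = posterior_w1((0.5, 0.5), s1, s2)
```

Because both bandwidths equal κ||s_1 - s_2||, the two exponentials coincide at the midpoint regardless of the value chosen for κ.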
[0062] (The concept of recognition processing) Next, the processing concept of the recognition processing part 14 shown in drawing 1 is explained. This recognition processing part 14 performs local linear discrimination in a nonlinearly mapped high-dimensional (infinite-dimensional) space by combining local linear discrimination, which performs linear discrimination using the local reference patterns located near the input data to be recognized, with the kernel trick described later.
[0063] Drawing 9 is an explanatory view of the local linear discrimination performed by this recognition processing part 14. Here, reference patterns belonging to category A are drawn as small circles, and reference patterns belonging to category B are drawn as small squares.
[0064] As shown in the figure, when input data x to be recognized is input, a circle of radius r centered on this input data x is taken as the local region, and the local means m_A and m_B of each category are calculated. A separating hyperplane which perpendicularly bisects the segment joining these local means is then considered, and x is classified according to which side of this separating hyperplane it lies on.

[0065] Specifically, to judge this input data x, [Equation 11]

\[ f(x) = (m_A - m_B)^t \left( x - \frac{m_A + m_B}{2} \right) \qquad (11) \]

is calculated, and if f(x) > 0, the input data x is judged to belong to category A.
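A minimal sketch of this local linear discrimination is given below, assuming patterns are tuples of floats and the two categories are labeled "A" and "B". The function names, the error raised when one category is absent from the local region, and the example coordinates are assumptions made for illustration.

```python
import math

def mean(points):
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def local_linear_f(x, patterns, labels, radius):
    """f(x) = (m_A - m_B)^t (x - (m_A + m_B)/2) over the local region."""
    near = {"A": [], "B": []}
    for p, lab in zip(patterns, labels):
        if math.dist(x, p) <= radius:       # inside the circle of radius r
            near[lab].append(p)
    if not near["A"] or not near["B"]:
        raise ValueError("local region must contain both categories")
    m_a, m_b = mean(near["A"]), mean(near["B"])
    return sum((a - b) * (xi - (a + b) / 2.0)
               for a, b, xi in zip(m_a, m_b, x))

# Hypothetical patterns; the last one lies outside the local region.
patterns = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 2.0), (100.0, 100.0)]
labels = ["A", "A", "B", "B", "B"]
f_value = local_linear_f((1.0, 1.0), patterns, labels, radius=5.0)
category = "A" if f_value > 0 else "B"
```

Only the patterns inside the local region contribute to the means, so a distant pattern such as the last one above does not shift the separating hyperplane.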
[0066] Next, the support vector machine and the kernel trick are explained. As described in Koji Tsuda, "What is a support vector machine?", Institute of Electronics, Information and Communication Engineers, June 2000, pp. 460-466, the support vector machine is a technique in which feature vectors are mapped to a high-dimensional space by a nonlinear transformation and a hyperplane that linearly separates two categories (classes) is then found by quadratic programming. Among the hyperplanes that achieve linear separation, the one obtained maximizes the margin, which is the minimum distance between the hyperplane and the training patterns, and it is excellent in generalization ability. In the support vector machine, the purpose of mapping to a high-dimensional space is to make linear separation easy even when the number of training patterns increases. To keep the computational complexity small, the inner-product calculation of the discriminant function in the high-dimensional space after mapping is replaced directly by a kernel function, without calculating the mapping of the feature vectors. This is called the kernel trick.
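The replacement of an inner product in the mapped space by a kernel function can be illustrated with a small sketch. It uses a degree-2 polynomial kernel on two-dimensional vectors purely because its feature map can be written out explicitly; this kernel is an assumption chosen for illustration and is not the Gaussian kernel of the embodiment, whose mapped space is infinite-dimensional.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def phi(v):
    """Explicit feature map of a 2-D vector into a 3-D space."""
    x1, x2 = v
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def poly_kernel(a, b):
    """Inner product of phi(a) and phi(b), computed in the original space."""
    return dot(a, b) ** 2

a, b = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(a), phi(b))     # map first, then take the inner product
via_kernel = poly_kernel(a, b)     # kernel trick: no mapping is computed
```

Both routes give the same number, but the kernel route never constructs the mapped vectors, which is what keeps the computation small when the mapped space is very large.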

[0067] Drawing 10 is an explanatory view of the concept of the kernel trick. When the discrimination boundary between category A and category B is complicated as shown in the figure, the data a, which belongs to category A, and the data b, which belongs to category B, cannot be separated by the discrimination boundary L1, so it cannot be judged to which of the two categories input data belongs.
[0068] However, if the number of dimensions is increased, it becomes easy to separate the two categories with a straight line. For example, although the categories cannot be separated by the discrimination boundary L1 shown in the figure, the data a and the data b can be classified by using the discrimination boundary L2 in the higher-dimensional space. Thus, with the kernel trick, a hyperplane separating the two categories is sought after the feature vectors are mapped to a high-dimensional space.

[0069] In a support vector machine, the kernel must be evaluated as many times as there are support vectors, so discrimination takes time. In this invention, however, editing has already been performed as explained above, and the Gaussian kernel needs to be calculated only for the neighboring patterns even when the kernel trick is used, so discrimination can be performed quickly.
[0070]Next, it explains still more concretely about partial linearity discernment using a kernel trick which 
this recognizing processing part 14 performs. Drawing 1 1 is an explanatory view for explaining a concept of 
partial linearity discernment of having used a kernel trick. 

[0071] If local linear discrimination is performed in the original feature space as shown in Drawing 11(a),
linear separation becomes impossible when the category boundary is very complex. For example, when the true
category boundary undulates like a wave as shown in the figure, an appropriate result is not obtained
because the local discrimination boundary is a straight line.
[0072] On the other hand, if local linear discrimination is performed in a high-dimensional space to which
the original feature vector has been mapped nonlinearly using the kernel trick, as shown in Drawing 11(b),
the region between the straight lines L3 and L4 serves as the separator between the categories, so linear
separation is achieved by local linear discrimination.

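[0073] Next, the local linear discriminant function in the original d-dimensional feature space $R^d$ is
explained concretely. Here it is assumed that the covariance matrices $\Sigma_1$ and $\Sigma_2$ of the two
categories 1 and 2 are equal and are a constant multiple of the identity matrix.
[0074] The local linear discriminant function $f_{12}(x)$ in this case is [Equation 12]

$$
\begin{aligned}
f_{12}(x) &= (m_1 - m_2)^t \left( x - \frac{m_1 + m_2}{2} \right) \\
&= \frac{1}{n_1}\sum_{i=1}^{n_1} x_{1i}^t x - \frac{1}{n_2}\sum_{i=1}^{n_2} x_{2i}^t x
 - \frac{1}{2 n_1^2}\sum_{i=1}^{n_1}\sum_{j=1}^{n_1} x_{1i}^t x_{1j}
 + \frac{1}{2 n_2^2}\sum_{i=1}^{n_2}\sum_{j=1}^{n_2} x_{2i}^t x_{2j}
\end{aligned}
\tag{12}
$$

and if $f_{12}(x) > 0$, the input data x is identified as belonging to category 1.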
[0075] Here, the local averages $m_1$ and $m_2$ are the averages of the neighborhood patterns belonging to
categories 1 and 2, respectively, and $x_{1i}$ $(i = 1, \ldots, n_1)$ and $x_{2i}$ $(i = 1, \ldots, n_2)$ are
the neighborhood patterns belonging to categories 1 and 2, respectively, which satisfy [Equation 13]

$$
|x - x_{1i}|^2 < d_k^2, \qquad |x - x_{2i}|^2 < d_k^2 \tag{13}
$$

where $d_k$ is the k-nearest-neighbor distance.
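
The discriminant function of Equation 12 can be sketched in Python as follows. This is an illustrative
sketch only: the function names, the sample neighborhood patterns and the input x are assumptions, and the
neighborhood patterns of each category are taken as already selected. The snippet evaluates both the compact
form, based on the local averages, and the form expanded into sums of inner products, so that the two can be
compared.

```python
def dot(u, v):
    """Ordinary inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def mean(points):
    """Component-wise average of a list of vectors (the local average m)."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def f12_compact(x, nb1, nb2):
    """(m1 - m2)^t (x - (m1 + m2) / 2), using the local averages directly."""
    m1, m2 = mean(nb1), mean(nb2)
    diff = [a - b for a, b in zip(m1, m2)]
    offset = [xi - (a + b) / 2.0 for xi, a, b in zip(x, m1, m2)]
    return dot(diff, offset)

def f12_expanded(x, nb1, nb2):
    """The same function written purely as sums of inner products."""
    n1, n2 = len(nb1), len(nb2)
    t1 = sum(dot(p, x) for p in nb1) / n1
    t2 = sum(dot(p, x) for p in nb2) / n2
    t3 = sum(dot(p, q) for p in nb1 for q in nb1) / (2.0 * n1 * n1)
    t4 = sum(dot(p, q) for p in nb2 for q in nb2) / (2.0 * n2 * n2)
    return t1 - t2 - t3 + t4

# Hypothetical neighborhood patterns of categories 1 and 2 and an input pattern x.
nb1 = [(0.0, 0.0), (1.0, 0.0)]
nb2 = [(3.0, 0.0), (4.0, 1.0)]
x = (1.5, 0.2)

value = f12_compact(x, nb1, nb2)
category = 1 if value > 0 else 2
print(value, f12_expanded(x, nb1, nb2), category)
```

Both forms give the same value up to rounding (about 1.525 for this sample), so x is assigned to category 1.
The expanded form matters because it contains the patterns only through inner products.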

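[0076] If a mapping $\phi$ is chosen such that the nearest-neighbor patterns in the original feature space
are also the nearest-neighbor patterns in the mapped high-dimensional space $R^{d_\phi}$, the discriminant
function $f_{12}(\phi(x))$ in $R^{d_\phi}$ becomes [Equation 14]

$$
\begin{aligned}
f_{12}(\phi(x)) = {} & \frac{1}{n_1}\sum_{i=1}^{n_1} \phi(x_{1i})^t \phi(x)
 - \frac{1}{n_2}\sum_{i=1}^{n_2} \phi(x_{2i})^t \phi(x) \\
& - \frac{1}{2 n_1^2}\sum_{i=1}^{n_1}\sum_{j=1}^{n_1} \phi(x_{1i})^t \phi(x_{1j})
 + \frac{1}{2 n_2^2}\sum_{i=1}^{n_2}\sum_{j=1}^{n_2} \phi(x_{2i})^t \phi(x_{2j})
\end{aligned}
\tag{14}
$$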

[0077] Since this discriminant function is expressed as a linear combination of inner products in
$R^{d_\phi}$, the kernel trick can be applied to it. That is, it suffices to evaluate a real-valued function,
without actually calculating the mapping $\phi(x)$ to the high-dimensional space.
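
A minimal Python sketch of this substitution follows: every inner product between mapped patterns is
replaced by a kernel evaluation, so the mapping itself is never computed. The function names, the sample
data, and the Gaussian kernel normalization exp(-|x - y|^2 / sigma^2) with a hypothetical width `sigma` are
assumptions made for illustration.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel; the normalization by sigma**2 is an assumption."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / sigma ** 2)

def linear_kernel(x, y):
    """Plain inner product, i.e. the identity mapping."""
    return sum(a * b for a, b in zip(x, y))

def f12_kernel(x, nb1, nb2, kernel):
    """Local discriminant function written with kernel evaluations only."""
    n1, n2 = len(nb1), len(nb2)
    t1 = sum(kernel(x, p) for p in nb1) / n1
    t2 = sum(kernel(x, p) for p in nb2) / n2
    t3 = sum(kernel(p, q) for p in nb1 for q in nb1) / (2.0 * n1 * n1)
    t4 = sum(kernel(p, q) for p in nb2 for q in nb2) / (2.0 * n2 * n2)
    return t1 - t2 - t3 + t4

# Hypothetical neighborhood patterns of categories 1 and 2 and an input pattern x.
nb1 = [(0.0, 0.0), (1.0, 0.0)]
nb2 = [(3.0, 0.0), (4.0, 1.0)]
x = (1.5, 0.2)

value_gauss = f12_kernel(x, nb1, nb2, gaussian_kernel)
value_linear = f12_kernel(x, nb1, nb2, linear_kernel)
print(value_gauss, value_linear)
```

With the linear kernel the function reduces to the discriminant in the original feature space, and with the
Gaussian kernel it is positive for this sample, so x would again be assigned to category 1.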

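[0078] That is, with a kernel function satisfying the relation $K(x, y) = \phi(x)^t \phi(y)$, the
discriminant function $f_{12}(\phi(x))$ becomes [Equation 15]

$$
\begin{aligned}
f_{12}(\phi(x)) = {} & \frac{1}{n_1}\sum_{i=1}^{n_1} K(x, x_{1i})
 - \frac{1}{n_2}\sum_{i=1}^{n_2} K(x, x_{2i}) \\
& - \frac{1}{2 n_1^2}\sum_{i=1}^{n_1}\sum_{j=1}^{n_1} K(x_{1i}, x_{1j})
 + \frac{1}{2 n_2^2}\sum_{i=1}^{n_2}\sum_{j=1}^{n_2} K(x_{2i}, x_{2j})
\end{aligned}
\tag{15}
$$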

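[0079] The mapping $\phi$ corresponding to the function called a Gaussian kernel, [Equation 16]

$$
K(x, y) = \exp\left( -\frac{|x - y|^2}{\sigma^2} \right) \tag{16}
$$

preserves, in the mapped space as well, the relation of Euclidean distances in the original space.

[0080] That is, [Equation 17]

$$
\begin{aligned}
& |x - y_1|^2 < |x - y_2|^2 \;\Rightarrow\; |\phi(x) - \phi(y_1)|^2 < |\phi(x) - \phi(y_2)|^2 \\
& \because\; |\phi(x) - \phi(y)|^2 = \phi(x)^t \phi(x) - 2K(x, y) + \phi(y)^t \phi(y)
 = 2\left( 1 - \exp\left( -\frac{|x - y|^2}{\sigma^2} \right) \right)
\end{aligned}
\tag{17}
$$

holds. Therefore, the k nearest-neighbor patterns in the original feature space are also the k
nearest-neighbor patterns in the mapped space. The mapping $\phi$ corresponding to the Gaussian kernel maps
to an infinite-dimensional space.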
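
The preservation of the neighbor ordering can be checked numerically with a short Python sketch. The sample
patterns, the kernel width and the helper names are assumptions made for illustration. The squared distance
in the mapped space is computed from kernel values only, as K(x, x) - 2K(x, y) + K(y, y), and the reference
patterns are ranked by distance in both spaces.

```python
import math

def sq_dist(x, y):
    """Squared Euclidean distance in the original feature space."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel; the normalization by sigma**2 is an assumption."""
    return math.exp(-sq_dist(x, y) / sigma ** 2)

def mapped_sq_dist(x, y, kernel):
    """Squared distance between phi(x) and phi(y), using kernel values only."""
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)

# Hypothetical input pattern and reference patterns.
x = (0.0, 0.0)
refs = [(0.3, 0.1), (2.0, -1.0), (0.5, 0.5), (-1.0, 0.2), (1.2, 1.1)]

order_original = sorted(range(len(refs)), key=lambda i: sq_dist(x, refs[i]))
order_mapped = sorted(range(len(refs)),
                      key=lambda i: mapped_sq_dist(x, refs[i], gaussian_kernel))
print(order_original, order_mapped)
```

Both rankings are identical for this sample, because 2(1 - exp(-d / sigma^2)) increases monotonically with
the original squared distance d. A nearest-neighbor search can therefore be carried out in the original
feature space.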

[0081] Although a detailed explanation is omitted here, as the local linear discriminant function in the
original d-dimensional feature space $R^d$ when $\Sigma_1 \neq \Sigma_2$, [Equation 18]

$f_{12}(x) = \ldots$ (18)

that is, the method of Fisher shown in Drawing 12, can also be used. Here, $\Sigma_T$ is the total
covariance matrix of categories 1 and 2.

[0082] Next, the procedure of the recognition processing part 14 shown in Drawing 1 is explained. Drawing 13
is a flow chart showing the procedure of the recognition processing part 14 shown in Drawing 1. As shown in
the figure, when the feature vector x, which is the input data, is input (Step S1301), the recognition
processing part 14 searches the N reference patterns for the k nearest-neighbor patterns (Step S1302) and
checks whether all k nearest-neighbor patterns belong to the same category $C_0$ (Step S1303).
[0083] If they all belong to the same category $C_0$ (Step S1304: Yes), the input is recognized as belonging
to category $C_0$ (Step S1310). If they do not all belong to the same category $C_0$ (Step S1304: No), the
top two categories $C_1$ and $C_2$ are chosen (Step S1305), and the local discriminant function using the
kernel trick already explained is applied (Step S1306).
[0084] If the value of this discriminant function is larger than 0 (Step S1307: Yes), the input is
recognized as category $C_1$ (Step S1308); if the value of the discriminant function is not larger than 0
(Step S1307: No), the input is recognized as category $C_2$ (Step S1309).
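
The procedure of Steps S1301 to S1310 can be sketched in Python as below. This is a sketch under stated
assumptions rather than the implementation of the recognition processing part 14: the reference patterns,
the kernel width `sigma`, the brute-force neighbor search, and the reading of "top two categories" as the
two categories that occur most often among the k neighbors are all assumptions.

```python
import math
from collections import Counter

def sq_dist(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel; the normalization by sigma**2 is an assumption."""
    return math.exp(-sq_dist(x, y) / sigma ** 2)

def f12_kernel(x, nb1, nb2, kernel):
    """Local discriminant function written with kernel evaluations only."""
    n1, n2 = len(nb1), len(nb2)
    t1 = sum(kernel(x, p) for p in nb1) / n1
    t2 = sum(kernel(x, p) for p in nb2) / n2
    t3 = sum(kernel(p, q) for p in nb1 for q in nb1) / (2.0 * n1 * n1)
    t4 = sum(kernel(p, q) for p in nb2 for q in nb2) / (2.0 * n2 * n2)
    return t1 - t2 - t3 + t4

def recognize(x, references, k, sigma=1.0):
    """references is a list of (feature_vector, category) pairs."""
    # S1302: find the k nearest reference patterns (brute force here).
    neighbours = sorted(references, key=lambda r: sq_dist(x, r[0]))[:k]
    counts = Counter(cat for _, cat in neighbours)
    # S1303, S1304, S1310: unanimous neighbours decide directly.
    if len(counts) == 1:
        return neighbours[0][1]
    # S1305: top two categories (assumed: most frequent among the k neighbours).
    (c1, _), (c2, _) = counts.most_common(2)
    nb1 = [v for v, cat in neighbours if cat == c1]
    nb2 = [v for v, cat in neighbours if cat == c2]
    # S1306: kernelized local discriminant function.
    value = f12_kernel(x, nb1, nb2, lambda u, v: gaussian_kernel(u, v, sigma))
    # S1307 to S1309: positive value means C1, otherwise C2.
    return c1 if value > 0 else c2

# Hypothetical recognition dictionary with two categories.
references = [((0.0, 0.0), "A"), ((1.0, 0.0), "A"), ((0.5, 0.5), "A"),
              ((3.0, 0.0), "B"), ((4.0, 1.0), "B"), ((3.5, -0.5), "B")]
results = [recognize(q, references, k=3) for q in [(0.2, 0.1), (1.8, 0.1), (2.6, 0.0)]]
print(results)  # prints: ['A', 'A', 'B']
```

The first query has three neighbors of the same category and is decided at Step S1310; the other two have
mixed neighbors and are decided by the sign of the discriminant function.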
[0085] As described above, according to this embodiment, the recognition dictionary management part 16
performs editing of the recognition dictionary 15 in which patterns distant from the discrimination boundary
are deleted, and the recognition processing part 14 performs local linear discrimination using the kernel
trick. Therefore, the recognition accuracy of nonparametric pattern recognition can be improved while the
memory capacity for storing reference patterns is limited, even when k of the k nearest neighbors is three
or more.
[0086] Although this embodiment shows a case where the invention is applied to a character reader, the
invention is not limited to this and can be applied to various pattern recognition devices that recognize
patterns. It is, however, especially effective for character sets with a small number of categories, such
as English letters, numerals, and katakana. For Chinese characters, which have many categories, it is
effective as a means of realizing fine discrimination between specific similar categories.
[0087]

[Effect of the Invention] As explained above, according to the invention of claim 1, reference patterns
distant from the discrimination boundary between categories are deleted from the recognition dictionary,
which stores a plurality of reference patterns classified by category, and pattern recognition is performed
by local linear discrimination based on the recognition dictionary after the deletion. This has the effect
of providing a pattern recognition device that can improve recognition accuracy in nonparametric pattern
recognition while limiting the memory capacity for storing reference patterns, even when k of the k nearest
neighbors is three or more.
[0088] According to the invention of claim 2, pattern recognition is performed by local linear
discrimination in a high-dimensional space to which the original feature vector of the input pattern has
been mapped nonlinearly. This has the effect of providing a pattern recognition device that can recognize
patterns accurately even when the discrimination boundary is complex.
[0089] According to the invention of claim 3, a Gaussian kernel, which preserves in the mapped
high-dimensional space the relation of Euclidean distances in the original discrimination space, is used
for the discriminant function. This has the effect of providing a pattern recognition device that can
perform recognition efficiently, because the reference patterns located in the neighborhood do not change
before and after the mapping.

[0090] According to the invention of claim 4, reference patterns distant from the discrimination boundary
between categories are deleted from the recognition dictionary, which stores a plurality of reference
patterns classified by category, and pattern recognition is performed by local linear discrimination based
on the recognition dictionary after the deletion. This has the effect of providing a pattern recognition
method that can improve recognition accuracy in nonparametric pattern recognition while limiting the memory
capacity for storing reference patterns, even when k of the k nearest neighbors is three or more.

[0091] According to the invention of claim 5, pattern recognition is performed by local linear
discrimination in a high-dimensional space to which the original feature vector of the input pattern has
been mapped nonlinearly. This has the effect of providing a pattern recognition method that can recognize
patterns accurately even when the discrimination boundary is complex.
[0092] According to the invention of claim 6, a Gaussian kernel, which preserves in the mapped
high-dimensional space the relation of Euclidean distances in the original discrimination space, is used
for the discriminant function. This has the effect of providing a pattern recognition method that can
perform recognition efficiently, because the reference patterns located in the neighborhood do not change
before and after the mapping.

[0093] According to the invention of claim 7, the program that causes a computer to execute the method
described in any one of claims 4 to 6 becomes machine-readable, and the operation of any one of claims 4 to
6 can thereby be realized by a computer.
[Brief Description of the Drawings]

[Drawing 1] It is a functional block diagram showing the configuration of the character reader according to
this embodiment of the invention.

[Drawing 2] It is an explanatory view showing an example of the distribution of reference patterns
belonging to two categories.

[Drawing 3] It is an explanatory view in which an example of a discrimination boundary is drawn on the
distribution of reference patterns shown in Drawing 2.

[Drawing 4] It is an explanatory view for explaining the concept of discrimination when a Parzen classifier
is used.

[Drawing 5] It is a flow chart showing the editing procedure of the recognition dictionary management part
shown in Drawing 1.

[Drawing 6] It is an explanatory view for explaining the process of reducing reference patterns by the
recognition dictionary management part shown in Drawing 1.

[Drawing 7] It is an explanatory view for explaining an example of the effect of changing the bandwidth by
the recognition dictionary management part shown in Drawing 1.

[Drawing 8] It is an explanatory view for explaining the case simplified to two categories and two nearest
neighbors (k = 2).

[Drawing 9] It is an explanatory view for explaining the local linear discrimination that the recognition
processing part shown in Drawing 1 performs.

[Drawing 10] It is an explanatory view for explaining the concept of the kernel trick.

[Drawing 11] It is an explanatory view for explaining the concept of local linear discrimination using the
kernel trick.

[Drawing 12] It is an explanatory view for explaining the method of Fisher.

[Drawing 13] It is a flow chart showing the procedure of the recognition processing part shown in
Drawing 1.

[Description of Notations]

10 Character reader

11 Image input part

12 Preprocessing part

13 Feature extraction part

14 Recognition processing part

15 Recognition dictionary

16 Recognition dictionary management part

A, B Category


